    Microarray foray

    Insights into the evolution of Darwin's finches from comparative analysis of the Geospiza magnirostris genome sequence

    Background: A classical example of repeated speciation coupled with ecological diversification is the evolution of 14 closely related species of Darwin's (Galápagos) finches (Thraupidae, Passeriformes). Their adaptive radiation in the Galápagos archipelago took place in the last 2-3 million years and some of the molecular mechanisms that led to their diversification are now being elucidated. Here we report evolutionary analyses of the genome of the large ground finch, Geospiza magnirostris. Results: 13,291 protein-coding genes were predicted from a 991.0 Mb G. magnirostris genome assembly. We then defined gene orthology relationships and constructed whole genome alignments between the G. magnirostris and other vertebrate genomes. We estimate that 15% of genomic sequence is functionally constrained between G. magnirostris and zebra finch. Genic evolutionary rate comparisons indicate that similar selective pressures acted along the G. magnirostris and zebra finch lineages, suggesting that historical effective population size values have been similar in both lineages. 21 otherwise highly conserved genes were identified that each show evidence for positive selection on amino acid changes in the Darwin's finch lineage. Two of these genes (Igf2r and Pou1f1) have been implicated in beak morphology changes in Darwin's finches. Five of 47 genes showing evidence of positive selection in early passerine evolution have cilia-related functions, and may be examples of adaptively evolving reproductive proteins. Conclusions: These results provide insights into past evolutionary processes that have shaped G. magnirostris genes and its genome, and provide the necessary foundation upon which to build population genomics resources that will shed light on more contemporaneous adaptive and non-adaptive processes that have contributed to the evolution of the Darwin's finches. © 2013 Rands et al.; licensee BioMed Central Ltd.

    The TAO-Gen Algorithm for Identifying Gene Interaction Networks with Application to SOS Repair in E. coli

    One major unresolved issue in the analysis of gene expression data is the identification and quantification of gene regulatory networks. Several methods have been proposed for identifying gene regulatory networks, but these methods predominantly focus on the use of multiple pairwise comparisons to identify the network structure. In this article, we describe a method for analyzing gene expression data to determine a regulatory structure consistent with an observed set of expression profiles. Unlike other methods, this method goes beyond pairwise evaluations by using likelihood-based statistical methods to obtain the network that is most consistent with the complete data set. The proposed algorithm performs accurately for moderate-sized networks, with most errors being minor additions of linkages. However, the analysis also indicates that sample sizes may need to be increased to uniquely identify even moderate-sized networks. The method is used to evaluate interactions between genes in the SOS signaling pathway in Escherichia coli using gene expression data where each gene in the network is over-expressed using plasmid inserts.

    Probabilistic Clustering of Time-Evolving Distance Data

    We present a novel probabilistic clustering model for objects that are represented via pairwise distances and observed at different time points. The proposed method utilizes the information given by adjacent time points to find the underlying cluster structure and obtain a smooth cluster evolution. This approach allows the number of objects and clusters to differ at every time point, and no identification of the objects across time points is needed. Further, the model does not require the number of clusters to be specified in advance; it is instead determined automatically using a Dirichlet process prior. We validate our model on synthetic data, showing that the proposed method is more accurate than state-of-the-art clustering methods. Finally, we use our dynamic clustering model to analyze and illustrate the evolution of brain cancer patients over time.
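To illustrate how a Dirichlet process prior leaves the number of clusters open, the following minimal sketch samples one partition from a Chinese restaurant process, the standard partition view of a Dirichlet process. It is not the model from the abstract, which additionally handles pairwise distances and time; the function name `sample_crp` and the concentration parameter `alpha` are assumptions chosen for illustration.

```python
import random

def sample_crp(n_objects, alpha, seed=0):
    """Draw one partition of n_objects from a Chinese restaurant process.

    alpha > 0 is the concentration parameter (hypothetical name); larger
    values tend to produce more clusters.
    """
    rng = random.Random(seed)
    labels = []  # cluster label of each object, in arrival order
    sizes = []   # current size of each cluster
    for _ in range(n_objects):
        # Join existing cluster k with weight sizes[k],
        # or open a new cluster with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # a new cluster is created
        else:
            sizes[k] += 1     # an existing cluster grows
        labels.append(k)
    return labels, sizes
```

Calling `sample_crp(50, alpha=1.0)` returns a label per object and the cluster sizes; the number of clusters is an outcome of the draw, not an input.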

    Anticancer drug clustering in lung cancer based on gene expression profiles and sensitivity database

    BACKGROUND: The effect of current therapies in improving the survival of lung cancer patients remains far from satisfactory. It is consequently desirable to find more appropriate therapeutic opportunities based on informed insights. A molecular pharmacological analysis was undertaken to design an improved chemotherapeutic strategy for advanced lung cancer. METHODS: We related the cytotoxic activity of each of the commonly used anti-cancer agents (docetaxel, paclitaxel, gemcitabine, vinorelbine, 5-FU, SN-38, cisplatin (CDDP), and carboplatin (CBDCA)) to the corresponding expression pattern in each of the cell lines using a modified NCI program. RESULTS: We performed gene expression analysis in lung cancer cell lines using cDNA filter and high-density oligonucleotide arrays. We also examined the sensitivity of these cell lines to these drugs via MTT assay. To obtain reproducible gene-drug sensitivity correlation data, we separately analyzed two sets of lung cancer cell lines, of 10 and 19 cell lines respectively. In our gene-drug correlation analyses, gemcitabine consistently belonged to an isolated cluster in a reproducible fashion. On the other hand, docetaxel, paclitaxel, 5-FU, SN-38, CBDCA and CDDP were gathered together into one large cluster. CONCLUSION: These results suggest that chemotherapy regimens including gemcitabine should be evaluated in second-line chemotherapy in cases where the first-line chemotherapy did not include this drug. Gene expression-drug sensitivity correlations, as provided by the NCI program, may yield improved therapeutic options for treatment of specific tumor types.
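The idea of grouping drugs by their gene-drug correlation patterns can be sketched as follows. This is a toy illustration under stated assumptions, not the modified NCI program mentioned in the abstract: Pearson correlation, the single-linkage grouping rule, the 0.5 threshold, and the function names are all choices made here for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def gene_drug_profile(expression, sensitivity):
    """Correlate each gene's expression across cell lines with one drug's sensitivity."""
    return [pearson(gene, sensitivity) for gene in expression]

def cluster_drugs(profiles, threshold=0.5):
    """Group drugs whose gene-drug profiles correlate at or above threshold (single linkage)."""
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(b)] = find(a)  # merge the two groups
    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())
```

Here `expression` is a list of per-gene expression vectors over the cell lines and `sensitivity` is one drug's sensitivity over the same cell lines; a drug whose profile correlates with no other drug ends up in a cluster of its own.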

    Microarray data analysis in neoadjuvant biomarker studies in estrogen receptor-positive breast cancer

    Microarray data have been widely utilized to discover biomarkers predictive of response to endocrine therapy in estrogen receptor-positive breast cancer. Typically, these data have focused on analyses conducted on the diagnostic specimen. However, dynamic temporal changes in gene expression associated with treatment may deliver significant improvements to the current generation of predictive models. We present and discuss some statistical issues relevant to the paper by Taylor and colleagues, who conducted studies to model the prognostic potential of gene expression changes that occur after endocrine treatment.

    Relationship between gene co-expression and probe localization on microarray slides

    BACKGROUND: Microarray technology allows simultaneous measurement of thousands of genes in a single experiment. This is a potentially useful tool for evaluating co-expression of genes and extraction of useful functional and chromosomal structural information about genes. RESULTS: In this work we studied the association between the co-expression of genes, their location on the chromosome and their location on the microarray slides by analyzing a number of eukaryotic expression datasets, derived from S. cerevisiae, C. elegans, and D. melanogaster. We find that in several different yeast microarray experiments the distribution of the number of gene pairs with correlated expression profiles as a function of chromosomal spacing is peaked at short separations and has two superimposed periodicities. The longer periodicity has a spacing of 22 genes (~42 kb), and the shorter periodicity is 2 genes (~4 kb). CONCLUSION: The relative positioning of DNA probes on microarray slides and source plates introduces subtle but significant correlations between pairs of genes. Careful consideration of this spatial artifact is important for analysis of microarray expression data. It is particularly relevant to recent microarray analyses that suggest that co-expressed genes cluster along chromosomes or are spaced by multiples of a fixed number of genes along the chromosome.
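The quantity described above, the number of gene pairs with correlated expression as a function of chromosomal spacing, can be computed along the following lines. This is a minimal sketch under stated assumptions: genes are given in chromosomal order, spacing is measured in number of genes, and the Pearson threshold of 0.8 and the function names are illustrative choices, not values from the paper.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def correlated_pairs_by_spacing(profiles, threshold=0.8, max_spacing=None):
    """Count correlated gene pairs at each spacing.

    profiles: expression profiles listed in chromosomal order.
    Returns {spacing in genes: number of pairs with r >= threshold}.
    """
    n = len(profiles)
    if max_spacing is None:
        max_spacing = n - 1
    counts = {d: 0 for d in range(1, max_spacing + 1)}
    for i in range(n):
        for j in range(i + 1, min(n, i + max_spacing + 1)):
            if pearson(profiles[i], profiles[j]) >= threshold:
                counts[j - i] += 1
    return counts
```

On toy data in which every second gene shares the same profile, the counts are non-zero only at even spacings, which is the kind of periodic signature the abstract attributes to probe layout. Plotting the counts against spacing for real data would be the next step.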

    Unsupervised reduction of random noise in complex data by a row-specific, sorted principal component-guided method

    Background: Large biological data sets, such as expression profiles, benefit from reduction of random noise. Principal component (PC) analysis has been used for this purpose, but it tends to remove small features as well as random noise. Results: We interpreted the PCs as a mere signal-rich coordinate system and sorted the squared PC-coordinates of each row in descending order. The sorted squared PC-coordinates were compared with the distribution of the ordered squared random noise, and PC-coordinates for insignificant contributions were treated as random noise and nullified. The processed data were transformed back to the initial coordinates as noise-reduced data. To increase the sensitivity of signal capture and reduce the effects of stochastic noise, this procedure was applied to multiple small subsets of rows randomly sampled from a large data set, and the results corresponding to each row of the data set from multiple subsets were averaged. We call this procedure Row-specific, Sorted PRincipal component-guided Noise Reduction (RSPR-NR). Robust performance of RSPR-NR, measured by noise reduction and retention of small features, was demonstrated using simulated data sets. Furthermore, when applied to an actual expression profile data set, RSPR-NR preferentially increased the correlations between genes that share the same Gene Ontology terms, strongly suggesting reduction of random noise in the data set. Conclusion: RSPR-NR is a robust random noise reduction method that retains small features well. It should be useful in improving the quality of large biological data sets.
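The row-wise step described above (sort the squared PC-coordinates, compare them with ordered squared noise, nullify the insignificant ones) might look roughly like the sketch below. It is not the published RSPR-NR implementation. It assumes the PC-coordinates of a row are already computed, that the noise is Gaussian with known standard deviation `sigma`, that per-rank thresholds are Monte Carlo quantiles, and that everything from the first non-significant rank downward is zeroed; the function names are hypothetical, and the subset sampling and averaging steps are omitted.

```python
import random

def rank_thresholds(n_coords, sigma=1.0, quantile=0.95, n_sim=2000, seed=0):
    """Monte Carlo quantiles of the ordered squared values of pure Gaussian noise."""
    rng = random.Random(seed)
    per_rank = [[] for _ in range(n_coords)]
    for _ in range(n_sim):
        squared = sorted((rng.gauss(0.0, sigma) ** 2 for _ in range(n_coords)),
                         reverse=True)
        for rank, value in enumerate(squared):
            per_rank[rank].append(value)
    index = min(n_sim - 1, int(quantile * n_sim))
    # One threshold per rank: largest squared value first.
    return [sorted(values)[index] for values in per_rank]

def nullify_noise(coords, thresholds):
    """Zero the coordinates of one row that are indistinguishable from noise."""
    # Indices of the coordinates, sorted by squared value in descending order.
    order = sorted(range(len(coords)), key=lambda i: coords[i] ** 2, reverse=True)
    cleaned = [0.0] * len(coords)
    for rank, i in enumerate(order):
        if coords[i] ** 2 <= thresholds[rank]:
            break  # this rank and all smaller ones are treated as noise
        cleaned[i] = coords[i]
    return cleaned
```

For a row with coordinates `[10.0, 0.1, -8.0, 0.2, -0.05]` and unit-variance noise, the two large coordinates survive and the rest are set to zero; the cleaned coordinates would then be transformed back to the original coordinate system.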